Combining Semantics and Social Knowledge for News Article Summarization
Authors
Elena Baralis, Politecnico di Torino, Italy
Luca Cagliero, Politecnico di Torino, Italy
Saima Jabeen, Politecnico di Torino, Italy
Alessandro Fiori, Institute for Cancer Research at Candiolo (IRCC), Italy
Sajid Shah, Politecnico di Torino, Italy
Abstract
With the diffusion of online newspapers and social media, users can retrieve dozens of news articles covering the same topic in a short time. News article summarization is the task of automatically selecting a worthwhile subset of sentences from news articles that users can easily explore. Promising research directions in this field are the use of semantics-based models (e.g., ontologies and taxonomies) to identify key document topics and the integration of social data analysis to also consider users' current interests during summary generation. This chapter surveys the most recent research advances in document summarization and presents a novel strategy that combines ontology-based and social knowledge to address the problem of generic (not query-based) multi-document summarization of news articles. To identify the most salient sentences in the news articles, an ontology-based text analysis is performed during the summarization process. Furthermore, social content acquired from real Twitter messages is analyzed separately so that the current interests of social network users are also considered during sentence evaluation. The combination of ontological and social knowledge allows the generation of accurate and easy-to-read news summaries. Moreover, the proposed summarizer performs better than the evaluated competitors on real news articles and Twitter messages.
Similar Resources
Combining Syntax and Semantics for Automatic Extractive Single-Document Summarization
The goal of automated summarization is to tackle the “information overload” problem by extracting and perhaps compressing the most important content of a document. Due to the difficulty that single-document summarization has in beating a standard baseline, especially for news articles, most efforts are currently focused on multi-document summarization. The goal of this study is to reconsider the...
Using Relevant Public Posts to Enhance News Article Summarization
A news article summary usually consists of 2-3 key sentences that reflect the gist of that news article. In this paper we explore using public posts following a new article to improve automatic summary generation for the news article. We propose different approaches to incorporate information from public posts, including using frequency information from the posts to re-estimate bigram weights i...
Ontology-based fuzzy event extraction agent for Chinese e-news summarization
An Ontology-based Fuzzy Event Extraction (OFEE) agent for Chinese e-news summarization is proposed in this article. The OFEE agent contains Retrieval Agent (RA), Document Processing Agent (DPA) and Fuzzy Inference Agent (FIA) to perform the event extraction for Chinese e-news summarization. First, RA automatically retrieves Internet e-news periodically, stores them into the e-news repository, a...
AutoCAP: An Automatic Caption Generation System based on the Text Knowledge Power Series Representation Model
This paper describes Automatic Caption generation for news Articles, an experimental intelligent system that generates presentations in text based on the text knowledge power series representation model. Captions or titles are useful for users who only need information on the main topics of an article. Using current extractive summarization techniques, it is not able to generate a coheren...
Categorization of Narrative Semantics for Use in Generative Multidocument Summarization
The generative summarization of textual stories has been one of the goals of computational narratology since attempts at full semantic NLU in the ’70s. Our NLP group has recently created several systems for multidocument news summarization using purely statistical methods. Between these poles, there may be an unexplored avenue where knowledge of story structure can give partial, yet useful sema...